Tel-Aviv University, Raymond and Beverly Sackler Faculty of Exact Sciences, School of Computer Science
Decision Trees: More Theoretical Justification for Practical Algorithms
Abstract
We study impurity-based decision tree algorithms such as CART and C4.5 in order to better understand their theoretical underpinnings. We consider these algorithms on restricted classes of functions and distributions: the uniform distribution, and functions that can be described as boolean linear threshold functions or read-once DNF. We show that for boolean linear threshold functions and read-once DNF, maximal purity gain and maximal influence are logically equivalent. This leads to the exact identification of these classes of functions by impurity-based algorithms, given sufficiently many noise-free examples. We show that the decision tree produced by these algorithms has minimal size among all decision trees representing the function. Based on the statistical query learning model, we introduce a noise-tolerant version of practical decision tree algorithms. We show that when the input examples are uniformly distributed and have small classification noise, all our results for practical noise-free impurity-based algorithms also hold for their noise-tolerant version.
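The claimed link between purity gain and influence can be checked by brute force on a small example. The following is a minimal sketch, not the paper's construction: it assumes the Gini index (scaled to 4p(1-p)) as the impurity function, and the weights, threshold, and helper names (`ltf`, `influence`, `purity_gain`) are hypothetical choices made for illustration. It enumerates all inputs under the uniform distribution and compares, for each variable, its influence with its purity gain.

```python
from itertools import product

# Hypothetical linear threshold function: f(x) = 1 iff 3*x0 + 2*x1 + 1*x2 >= 3.
WEIGHTS = (3, 2, 1)
THRESHOLD = 3
N = len(WEIGHTS)


def ltf(x):
    """Boolean linear threshold function over bits x (assumed example)."""
    return int(sum(w * b for w, b in zip(WEIGHTS, x)) >= THRESHOLD)


def gini(p):
    """Gini impurity of a node whose positive fraction is p (assumed choice)."""
    return 4.0 * p * (1.0 - p)


def influence(f, n, i):
    """Probability, over uniform x, that flipping bit i changes f(x)."""
    changed = 0
    for x in product((0, 1), repeat=n):
        flipped = list(x)
        flipped[i] ^= 1
        if f(x) != f(tuple(flipped)):
            changed += 1
    return changed / 2 ** n


def purity_gain(f, n, i, impurity=gini):
    """Impurity of the root minus the average impurity after splitting on bit i."""
    inputs = list(product((0, 1), repeat=n))
    p = sum(f(x) for x in inputs) / len(inputs)
    side0 = [x for x in inputs if x[i] == 0]
    side1 = [x for x in inputs if x[i] == 1]
    p0 = sum(f(x) for x in side0) / len(side0)
    p1 = sum(f(x) for x in side1) / len(side1)
    # Under the uniform distribution each side of the split has weight 1/2.
    return impurity(p) - 0.5 * impurity(p0) - 0.5 * impurity(p1)


influences = [influence(ltf, N, i) for i in range(N)]
gains = [purity_gain(ltf, N, i) for i in range(N)]
best_by_influence = max(range(N), key=lambda i: influences[i])
best_by_gain = max(range(N), key=lambda i: gains[i])

for i in range(N):
    print(f"x{i}: influence={influences[i]:.4f}  purity gain={gains[i]:.4f}")
print("same split variable:", best_by_influence == best_by_gain)
```

On this example the variable with the largest weight is selected under both criteria, which is the behaviour the abstract describes for linear threshold functions; a single small case is only a sanity check, not evidence for the general result.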